Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition

Authors

  • Zhuo Chen
  • Brian McFee
  • Daniel P. W. Ellis

Abstract

A successful speech enhancement system requires strong models for both speech and noise to decompose a mixture into the most likely combination. However, if the noise encountered differs significantly from the system’s assumptions, performance will suffer. In previous work, we proposed a speech enhancement framework based on decomposing the noisy spectrogram into low-rank background noise and a sparse activation of pre-learned templates, which requires few assumptions about the noise and showed promising results. However, when the noise is highly non-stationary or has large amplitude, the local SNR of the noisy speech can change drastically, resulting in less accurate decompositions between foreground speech and background noise. In this work, we extend the previous model by changing the modeling of the speech part to be the convolution of a sparse activation and pre-learned template patches, which enforces continuous structure within the speech and leads to better results in highly corrupted noisy mixtures.
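The model described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array layout, the names `conv_synthesis` and `objective`, the weights `lam_L` and `lam_H`, and the specific cost (squared error plus a nuclear-norm penalty on the noise and an L1 penalty on the activations) are assumptions chosen as one common way to express "low-rank noise plus sparse convolutive speech". The sketch also assumes the patch width does not exceed the number of frames.

```python
import numpy as np

def conv_synthesis(D, H):
    """Speech estimate S[f, t] = sum_k sum_tau D[k, f, tau] * H[k, t - tau].

    D: (K, F, P) array of K template patches, F frequency bins, P frames wide.
    H: (K, T) activation matrix, assumed sparse. Requires P <= T.
    """
    K, F, P = D.shape
    T = H.shape[1]
    S = np.zeros((F, T))
    for tau in range(P):
        # Shift activations right by tau frames, zero-filling the start.
        H_shift = np.zeros_like(H, dtype=float)
        H_shift[:, tau:] = H[:, :T - tau]
        # Column tau of every patch, weighted by the shifted activations.
        S += D[:, :, tau].T @ H_shift
    return S

def objective(Y, L, D, H, lam_L=1.0, lam_H=0.1):
    """Hypothetical cost: fit error + low-rank penalty on L + sparsity on H."""
    resid = Y - L - conv_synthesis(D, H)
    nuclear = np.linalg.norm(L, ord="nuc")  # sum of singular values of L
    return 0.5 * np.sum(resid ** 2) + lam_L * nuclear + lam_H * np.abs(H).sum()

# Toy example: one 2-frame patch activated once at frame 2.
D = np.zeros((1, 3, 2))
D[0, :, 0] = [1.0, 2.0, 3.0]
D[0, :, 1] = [4.0, 5.0, 6.0]
H = np.zeros((1, 5))
H[0, 2] = 1.0
S = conv_synthesis(D, H)   # patch appears in frames 2 and 3
L = np.ones((3, 5))        # rank-1 stand-in for background noise
Y = L + S                  # synthetic "noisy spectrogram"
cost = objective(Y, L, D, H)
```

Because each activation places a whole multi-frame patch rather than a single spectral column, neighbouring frames of the speech estimate are tied together, which is the continuity property the abstract attributes to the convolutive model.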

Similar resources

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applicable areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

Semi-supervised Speech Enhancement in Modulation Subspace

Previous studies show that existing speech enhancement algorithms can improve speech quality but not speech intelligibility. In this study, we propose a modulation subspace (MS) based speech enhancement framework, in which the spectrogram of noisy speech is decoupled as the product of a spectral envelope subspace and a spectral details subspace. This decoupling approach provides a method to spec...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel speech enhancement of signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Low-Rank Time-Frequency Synthesis

Many single-channel signal decomposition techniques rely on a low-rank factorization of a time-frequency transform. In particular, nonnegative matrix factorization (NMF) of the spectrogram – the (power) magnitude of the short-time Fourier transform (STFT) – has been considered in many audio applications. In this setting, NMF with the Itakura-Saito divergence was shown to underlie a generative Ga...

Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments

Classical single-channel speech enhancement algorithms have two convenient properties: they require pre-learning the noise model but not the speech model, and they work online. However, they often have difficulties in dealing with non-stationary noise sources. Source separation algorithms based on nonnegative spectrogram decompositions are capable of dealing with non-stationary noise, but do no...

Publication date: 2014